A validation study of data in the National Tonsil Surgery Register in Sweden: high agreement with medical records ensures that data can be used to monitor clinical practices and outcomes

Background: The ambition of the National Tonsil Surgery Register in Sweden (NTSRS) is to improve otorhinolaryngological care by monitoring trends in the clinical practices, complications, and outcomes of tonsil surgery. The NTSRS collects data from both surgeons and patients and provides the participating clinics with daily updated data on a publicly available website. On the website, national and local results can be compared and monitored. The use of NTSRS data necessitates that the data be valid, but the NTSRS has not yet been validated. With approximately half of the registered patients responding to the postoperative questionnaires, an analysis of responders and non-responders is also necessary. The aim of this study was to assess the criterion validity of NTSRS data. Another aim was to compare the characteristics and rates of complications between postoperative questionnaire responders and non-responders.
Methods: Data in the NTSRS were compared with data in electronic medical records. The 200 most recent surgeries, up to 31 Dec 2019, in each of 11 surgical units were included. Criterion validity was analysed in terms of observed agreement, Cohen’s kappa, Gwet’s AC1, and positive and negative agreement. The sign test was used to analyse systematic differences between the NTSRS and the medical records. Comparisons of rates between groups were made with Fisher’s exact test, the chi-square test, and Fisher’s non-parametric permutation test.
Results: A total of 1991 registrations were included in the study. All variables showed very high observed agreement ranging from 0.91 to 1.00, and all variables had AC1 values corresponding to almost perfect agreement. The analysis of questionnaire responders and non-responders showed no statistically significant differences regarding age, indication, or type of surgery. The proportion of women was higher in the responder group. The rate of reoperation due to bleeding was higher in the responder group, but there were no differences regarding other complications.
Conclusions: The results of this study show that data in the NTSRS have criterion validity. The NTSRS is thus well suited for monitoring the clinical practices and outcomes of tonsil surgery. The quality of the data also implies that the registry can be used in both clinical improvement projects and research.


Background
The National Tonsil Surgery Register in Sweden (NTSRS) was established in 1997 by the Swedish Association for Otorhinolaryngology Head and Neck Surgery. The aim was to improve ear, nose and throat (ENT) care by monitoring trends in the clinical practices, complications, and outcomes of tonsil surgery. The NTSRS aims to include all tonsil surgeries performed for benign indications. The NTSRS was revised to the current version in 2009. Participation in the NTSRS is voluntary, with a majority (82% in 2019) of Swedish surgical centres reporting to the register. A comparison with the Swedish National Patient Register (NPR) showed that in the last decade, approximately 80% of all tonsil surgeries in Sweden have been included [1]. All public and private healthcare providers in Sweden are obliged by law to report the number of performed tonsil surgeries to the NPR. By the end of 2019, more than 100,000 surgeries had been registered since the revision in 2009. In the NTSRS, the surgeon registers a perioperative form with information on age, sex, indication, surgical method, surgical technique for dissection and haemostasis, and postoperative bleeding occurring during the hospital stay. At 30 days and at 6 months after surgery, the patient is requested to fill in questionnaires with patient-reported outcome measures (PROMs). The 30-day questionnaire focuses on postoperative complications and recovery, and the 6-month questionnaire focuses on the success of the surgery (symptom relief). The mean response rate from 2009 to 2018 was 55% for the 30-day questionnaire and 46% for the 6-month questionnaire. A detailed description of the variables and the data collection procedure is available in English on the NTSRS website [1]. The NTSRS can be used as a benchmark to identify differences in clinical outcomes between surgical units and to evaluate the effect of quality improvement projects.
The registry provides the participating clinics with daily updated data, both raw and processed, via a publicly available website [2]. The results can thus be monitored longitudinally and compared with those of other surgical units. The register should function as a stimulator of competition to achieve best practice [3]. A health care register such as the NTSRS also provides data to measure value in health care [4] by monitoring the most important outcomes of tonsil surgery. Data from the NTSRS have been used in several scientific publications to describe clinical practices, compare surgical techniques, and evaluate postoperative complications [5-10]. NTSRS data have also been used in clinical quality improvement projects [11]. Such use of health care register data necessitates that the data be reliable and valid [12,13], but the NTSRS has not yet been validated. Thus, the NTSRS needs to be validated in a large-scale, multicentre, systematically conducted study.
A method widely used in quality register validation is to compare register data with expert-reviewed data from medical records [14,15]. This is also the recommended method for validation according to the Swedish Association of Local Authorities and Regions, which funds the Swedish quality registers [16]. The results of such an audit of a health register should be both published to substantiate register quality and used to improve data quality in the register [17]. There are only two national tonsil surgery registers in the world, the Swedish and the Norwegian. The Norwegian register was launched in 2017 and structured as a copy of the NTSRS. In 2019, the Norwegian Tonsil Surgery Register published the results of an agreement analysis (including analysis of observed agreement, Cohen's kappa and Gwet's AC1 coefficients) of the register database and medical records [15]. That study showed almost perfect agreement, but it was a single-centre study with a limited number of patients, and only variables related to the surgery were analysed. It would strengthen both tonsil surgery registers if the results could be reproduced in a multicentre study with a substantially larger study population that also included outcome PROM data. With approximately half of the registered patients responding to the PROM questionnaires, an analysis of responders and non-responders regarding both general characteristics and outcomes is also necessary.
The aim of this study was to evaluate the criterion validity of data in the NTSRS by comparison with data in medical records. Other aims were to gain a better understanding of potential weaknesses regarding the validity of NTSRS variables, and to compare the characteristics and rates of complications in patients who did and did not respond to the 30-day postoperative PROM questionnaire.

Study design
A retrospective comparison of data in the NTSRS and data in electronic medical records (EMRs) was conducted for the 200 most recently performed surgeries, up to 31 Dec 2019, in each of 11 surgical units. The goal of 2200 patients at 11 surgical centres was set with three intentions. The first intention was to include a composition of surgical centres representative of the NTSRS, the second intention was to include enough patients at each centre to ensure generalizability, and the third intention was to capture rare events. As clinical practice constantly evolves, we decided to include the 200 most recent surgeries at each unit rather than a random sample to ensure that the data analysed in this study reflected today's clinical practice and data quality. The choice to include the most recent surgeries rather than a random sample made it impossible to ascertain beforehand that the case mix of the studied cohort would be perfectly matched with the total NTSRS cohort. This was a calculated risk, but based on our professional knowledge of Swedish tonsil surgery practice and our choice to include a wide variety of surgical units, we assumed that the case mix of the study cohort would be satisfactorily comparable with that of the NTSRS cohort. In 2019, seven university clinics, 20 county hospitals, 10 rural hospitals and 15 private units participated in the NTSRS. With three university clinics, four county hospitals, two rural hospitals and two small private units included in this study, the sample was considered a good representation of Swedish tonsil surgical units. It was acknowledged in the planning phase that some of the data in the NTSRS might be difficult to validate against medical records, as many NTSRS variables are not registered in Swedish EMRs. Therefore, some of the NTSRS variables had to be left out of the study. Two examples were "For how many days after surgery did you take painkillers?" and "How many days after surgery did you start eating regular food?".
The steering committee of the NTSRS chose the variables of the study based on professional knowledge of what type of information could be expected to be found in Swedish medical records.
The selected variables with definitions are presented in Table 1.
There was only one exclusion criterion for the study: patients were excluded if they had a permanent residence outside the catchment area (i.e. health care region) of the surgical unit. The reason was that information on postoperative complications would be impossible to find in the medical records used by the operating clinic, as there is no national medical record system in Sweden. In Sweden, health care is provided by 21 health care regions. The EMRs used for data retrieval in this study were shared within the respective health care region. This means that the EMRs covered care provided across quite large geographical areas, including all hospitals and all ENT departments belonging to the health care region in question. Primary care EMRs were not used for data retrieval in this study. For PROM data analysis, only patients who responded to the 30-day questionnaire were included.
Previous studies based on the NTSRS have shown that hot (electrosurgical) surgical techniques, used either for dissection or for haemostasis, carry a higher risk for postoperative bleeding after tonsillectomy [7,8]. Cold (cold steel instruments) techniques have the lowest risk for bleeding, and there is no difference in risk between different electrosurgical techniques [7,8]. Therefore, the NTSRS decided to present composite data on the NTSRS website, with all techniques reclassified as cold or hot. Accordingly, surgical techniques were dichotomized in this study as either cold (cold dissection with cold haemostasis) or not cold.

Data collection
The source data in this study were the medical records (which in all cases were electronic medical records) and the NTSRS database. A duplicate of the NTSRS database (the validation database, VDB) was constructed to record the data from the medical records. Seven members ("monitors") of the NTSRS steering committee participated in data retrieval from the medical records. Behind a secure web login, the monitors entered data directly into the VDB via predefined categories in dropdown lists or checkboxes. The EMRs were accessed at each surgical unit through a computer with access to the data server of the health care region. A monitor's manual was constructed with the aim of achieving a fair and common assessment of the information in the medical records and consistent registration of data in the VDB.
The medical records were only accessed after obtaining written consent and a signed agreement with the head of the department at each surgical unit. The VDB also allowed additional information to be registered, with the aim of better understanding possible inconsistencies between the NTSRS and the medical records. One example of when additional information was valuable was if a patient reported an "admission due to bleeding" to the NTSRS without being admitted according to the medical records. In such cases, the additional information might explain why the patient had misinterpreted the question. Such information could be an admission due to pain, or that the patient stayed for overnight observation the first night after surgery due to unexpected peroperative bleeding. Another example was a patient who answered yes to the NTSRS question "Was another surgery performed due to bleeding" without any corresponding information being found in the EMRs. If the EMR of such a patient contained information on a surgical procedure performed under local anaesthesia, this information was recorded in the VDB. This additional information is presented in the results section, as appropriate. Mandatory variables (i.e. variables with no missing data) in the NTSRS are date of surgery, age, gender, indication and type of surgery. Data for all these variables must be entered into the database to create a valid entry. In the same way, age, gender, dates and codes for type of surgery and indication (for all visits and hospitalizations) are mandatory in the EMRs. For the EMRs, this means that the only variable where missing data is possible is surgical technique. Naturally, the absence of a registered complication in the EMR is regarded as a non-event and not as missing data.
The registration and collection of data from the medical records were performed in a two-step process. When all registrations were completed, a first mismatch analysis was performed. All variables in the study were on a nominal scale and thus could only match or not match. All monitors were notified (by e-mail) of the mismatching data at the unit that they had monitored, and each monitor received a data file from the VDB with the mismatching data points. The monitor then double-checked the medical records to ensure that the registration in the VDB was correct. The monitors were at all times blinded to the actual data in the NTSRS. The VDB was stored on the data server of the Centre of Registers Västra Götaland and was accessible to the monitors via a personal and secure web log-in. The NTSRS database was stored on the same data server as the VDB, but NTSRS data were only accessible to the statisticians. All medical records were electronic, stored on the data server of the respective health care region, and at all surgical units accessible to the monitors via a personal and secure on-site log-in.
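The first mismatch analysis can be pictured as a field-by-field comparison of each registration in the two databases. The following Python sketch is illustrative only: the record identifiers, the variable names and the `find_mismatches` helper are hypothetical and are not taken from the NTSRS or the VDB, and skipping missing values is an assumption rather than the study's documented rule.

```python
from collections import defaultdict

def find_mismatches(ntsrs, vdb, variables):
    """Return {record_id: [variable, ...]} for data points that differ.

    All variables are nominal, so two values either match or do not.
    Assumption: a value missing (None) in either source is skipped
    rather than counted as a mismatch.
    """
    mismatches = defaultdict(list)
    for record_id, ntsrs_row in ntsrs.items():
        vdb_row = vdb.get(record_id, {})
        for var in variables:
            a, b = ntsrs_row.get(var), vdb_row.get(var)
            if a is None or b is None:
                continue
            if a != b:
                mismatches[record_id].append(var)
    return dict(mismatches)

# Hypothetical example records (not study data)
ntsrs = {
    101: {"indication": "recurrent tonsillitis", "technique": "cold"},
    102: {"indication": "obstruction", "technique": "cold"},
}
vdb = {
    101: {"indication": "chronic tonsillitis", "technique": "cold"},
    102: {"indication": "obstruction", "technique": "not cold"},
}
result = find_mismatches(ntsrs, vdb, ["indication", "technique"])
print(result)
```

The output lists only which data points to re-check, not the register's values, which mirrors the requirement that the monitors remained blinded to the NTSRS data.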

Statistical analysis
The distributions of variables are given as numbers and percentages for categorical variables and as the mean and standard deviation (SD) and median (min-max) for continuous variables.
Criterion validity refers to the analysis of agreement (as described below) between two independent data sources, with the medical records considered as a non-reference standard. Criterion validity was further explored by careful descriptive assessment of mismatching data points in the source data (including the full medical notes). The results of this evaluation of discrepant data are reported in descriptive text to provide a good understanding of potential weaknesses in the criterion validity of the NTSRS variables.
The criterion validity, reflected by the agreement between the data in the NTSRS and the medical records, was analysed in terms of observed agreement, Cohen's kappa, Gwet's AC1 [18], and positive and negative agreement [19]. As many of the variables in the NTSRS have a skewed trait distribution, the kappa is at risk of yielding artificially low coefficients. Under such circumstances, the AC1 is a better estimate of the agreement than Cohen's kappa, as the AC1 is not influenced by unbalanced trait prevalence. The kappa and AC1 coefficients were interpreted as follows: ≤0.20, slight agreement; 0.21-0.40, fair agreement; 0.41-0.60, moderate agreement; 0.61-0.80, substantial agreement; and ≥0.81, almost perfect agreement. As we could not assume that the medical records used for comparison with the NTSRS were always correct, we had to consider medical record data as a non-reference standard [19]. Therefore, the estimates were called positive and negative agreement even though they were calculated with the same formulas as sensitivity and specificity [19]. Systematic differences between the NTSRS and the medical records were analysed with the sign test. The sign test is a conditional test that only considers the non-matching registrations of the NTSRS and the medical records. No adjustments were made for multiple comparisons.
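To make the agreement measures concrete, the following Python sketch computes them from a 2x2 table using the standard two-source formulas for Cohen's kappa and Gwet's AC1. It is not the SAS code used in the study. The function name and the cell labels are assumptions, and the example uses the postoperative bleeding counts reported in the Results (9 events in both sources, 12 in the NTSRS only, 7 in the medical records only) together with an assumed count of 1963 registrations with no event in either source, which presumes no missing data.

```python
def agreement_stats(a, b, c, d):
    """Agreement measures for a 2x2 table of two binary data sources.

    a: event in both register and medical records
    b: event in register only
    c: event in medical records only
    d: event in neither source
    """
    n = a + b + c + d
    po = (a + d) / n  # observed agreement

    # Cohen's kappa: chance agreement from each source's own marginals
    pe_kappa = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    kappa = (po - pe_kappa) / (1 - pe_kappa)

    # Gwet's AC1: chance agreement from the mean prevalence of the trait
    pi = ((a + b) + (a + c)) / (2 * n)
    pe_ac1 = 2 * pi * (1 - pi)
    ac1 = (po - pe_ac1) / (1 - pe_ac1)

    # Same formulas as sensitivity and specificity, with the medical
    # records as the (non-reference) comparator
    positive_agreement = a / (a + c)
    negative_agreement = d / (b + d)

    return {
        "observed": po,
        "kappa": kappa,
        "ac1": ac1,
        "positive_agreement": positive_agreement,
        "negative_agreement": negative_agreement,
    }

# d = 1963 is an assumption (1991 registrations, no missing data)
stats = agreement_stats(a=9, b=12, c=7, d=1963)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts, the kappa falls to roughly 0.48 while the AC1 stays near 0.99, which illustrates how a rare event depresses the kappa despite very high observed agreement.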
Rates were compared between groups by Fisher's exact test for dichotomous variables and the chi-square test for non-ordered categorical variables. Fisher's non-parametric permutation test was used for continuous variables. All significance tests were two-sided and conducted with a 5% significance level. SAS software version 9.4 (SAS Institute, Cary, NC, USA) was used for all statistical analyses.
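As an illustration of how Fisher's exact test compares a dichotomous variable between two groups, the following Python sketch sums hypergeometric probabilities directly. It is a minimal sketch and not the SAS procedure used in the study. The function name is hypothetical, the two-sided rule (summing all tables no more probable than the observed one) is one common convention, and the counts in the example are toy numbers, not study data.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Toy table: rows are two groups, columns are event / no event
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```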

Results
The intention was to include 2200 patients in the project. After the exclusion criterion was applied, 1991 patients remained in the study population. For comparison, 11,226 patients were registered in the NTSRS in 2019, of whom 1514 were included in the study population. Thus, the study population accounted for 13.5% of the NTSRS 2019 cohort. The composition of the whole study population was largely comparable to that of the 2019 NTSRS cohort (Table 2).
Due to the non-mandatory nature of some of the variables in the NTSRS, missing data occurred. Another reason for missing data was lack of information in the medical records. The numbers and percentages of missing data points for the non-mandatory variables in the NTSRS and EMRs are presented in Table 3.
The date of surgery differed by less than 7 days in 98% of the registrations and by more than 21 days in only 1%. All other variables showed very high observed agreement ranging from 0.91 to 1.00, and all variables had AC1 values corresponding to almost perfect agreement (Table 4).
The kappa values, as expected due to the skewed trait distributions, showed much greater variability (Table 4). Seven of the variables (of which 4 were different indications) had positive agreement < 0.80, while all variables had negative agreement ≥ 0.92 (Table 4).
Regarding the indication, 177 (9%) non-matches were observed. Seventy-one of these had multiple indications for surgery according to the medical records, including the indication registered in the NTSRS. Sixty-six patients had a specific non-match between recurrent and chronic tonsillitis. Thus, 40 out of 1991 (2.0%) of the patients had an unexplainable non-match regarding the indication for surgery. The most common problem encountered for the variable level of care (2.4% mismatch) was that planned outpatient surgery that, according to the medical records, was converted to inpatient surgery was registered as outpatient surgery in the NTSRS.
The observed agreement for surgical technique was 95%, and the kappa and AC1 showed almost perfect agreement. Nevertheless, there was a systematic significant difference between the NTSRS and the medical records, with 73 cold surgeries registered in the NTSRS that could not be verified in the medical records.
Postoperative bleeding showed high observed agreement, a low kappa but a high AC1, and low positive agreement. Only 9 postoperative bleeding events were recorded in both the NTSRS and medical records, while 12 were recorded in the NTSRS only and 7 in the medical records only.
Although the observed agreement for contact due to bleeding was high, the kappa showed good agreement, and the AC1 showed almost perfect agreement, there were still some important inconsistencies. Seven patients had, according to the medical records, been in contact with the healthcare system without reporting this in the NTSRS questionnaire. On the other hand, 16 patients reported, in the questionnaire, contact that could not be found in the medical records. The medical records revealed that of these 16 patients, two suffered postoperative bleeding that required some type of intervention before discharge from the primary surgery, and nine had documented contact either due to pain or another unspecified reason after discharge from the index surgery.
For admission due to bleeding, the observed agreement was 99%, and both the kappa and the AC1 showed almost perfect agreement. Only one patient had a registered admission due to bleeding without reporting this in the NTSRS 30-day PROM questionnaire. Nine patients reported admission due to bleeding without corresponding data in the medical records. According to the medical records, one patient had visited the emergency department with a bleeding throat but chose to leave the hospital even though admission was recommended for observation. Another patient had an outpatient visit due to bleeding but was not admitted overnight. One patient experienced postoperative bleeding before discharge after the index surgery, underwent reoperation due to the bleeding and was observed overnight. One patient was observed overnight after a planned outpatient surgery due to anaesthesia-related problems. Two patients were readmitted due to pain after discharge from the primary surgery. Three patients had no information on any contact or complications registered in the medical records. The fact that these 9 patients reported admission due to bleeding in the NTSRS without being admitted due to bleeding according to the medical records resulted in a significantly higher rate in the NTSRS than in the medical records (p = 0.021).
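Because the sign test conditions on the non-matching registrations only, the comparison for admission due to bleeding reduces to 9 NTSRS-only versus 1 record-only registration. The following Python sketch shows an exact two-sided binomial version of the test. The function name is hypothetical, and the assumption that this exact variant corresponds to the study's SAS computation is ours.

```python
from math import comb

def sign_test_two_sided(n_register_only, n_records_only):
    """Exact two-sided sign test on discordant registrations only.

    Under the null hypothesis, each discordant registration is equally
    likely to fall on either side (binomial with p = 0.5).
    """
    n = n_register_only + n_records_only
    k = min(n_register_only, n_records_only)
    # Probability of a split at least as uneven as observed, one tail
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)

# Admission due to bleeding: 9 NTSRS-only vs 1 medical-record-only
p_admission = sign_test_two_sided(9, 1)
print(round(p_admission, 3))  # 0.021
```

The result, 22/1024, is consistent with the p value of 0.021 reported for this variable.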
Regarding reoperation due to bleeding, both the observed agreement and the AC1 were very high. The kappa was lower but still showed substantial agreement. Nine patients had matching data regarding reoperation, but 7 patients reported reoperation in the NTSRS 30-day questionnaire without supporting information in the medical records. Of these 7 patients who did not have a recorded reoperation, two underwent reoperation under full anaesthesia before they were discharged after the index surgery. Another two underwent intervention with the use of bipolar diathermy under local anaesthesia after discharge from the primary surgery. This variable showed low positive agreement and a statistically significant difference between the NTSRS and the medical records (p = 0.016), with a higher rate in the NTSRS.
The variable contact due to pain had the lowest observed agreement, kappa and AC1 of all the 30-day PROM questionnaire variables. Of the non-matches, 31 patients had a contact in their medical records that was not reported in the NTSRS questionnaire. On the other hand, 65 patients reported a contact that could not be found in the medical records. Of these, 30 patients had some kind of contact recorded in their medical record that the monitor interpreted as due to other postoperative problems rather than primarily due to pain. The positive agreement was low, and there was a statistically significant difference between the NTSRS and the medical records (p = 0.0007), with a higher rate in the NTSRS.

Comparison of responders and non-responders to the 30-day NTSRS PROM questionnaire
The analysis was based on 1037 responders and 954 non-responders. This means that the response rate in the studied cohort was 52.1%. For comparison, the response rate in the total NTSRS cohort for 2019 was 50.9%. No statistically significant difference was found in terms of age, indication, type of surgery or level of care (Table 5). There was a statistically significant difference in the sex distribution (p = 0.0022), with a higher proportion of women in the responder group (Table 5).
Regarding the rates of contact due to bleeding, readmission due to bleeding, and contact due to pain, no differences were found between responders and non-responders to the 30-day PROM questionnaire. There was a significantly higher rate of patients who underwent reoperation due to bleeding in the responder group (Table 5).

Discussion
With few exceptions, this study shows that the data in the NTSRS have criterion validity. This is a prerequisite if quality register data is to be used to monitor and improve medical care [12,17]. Based on the AC1 values, it can be concluded that the agreement was almost perfect for all variables (range 0.84 to 1.00). The kappa values showed almost perfect or substantial agreement for 16 out of 20 variables. Furthermore, the analysis of responders and non-responders to the 30-day questionnaire showed that women were more prone to respond than men. There were no differences in outcome between patients who did and did not respond, with the exception that responders had a significantly higher rate of reoperation due to bleeding. This study establishes that the NTSRS is suited for monitoring clinical practices and outcomes of tonsil surgery. Thus, the registry can be used in both clinical improvement projects and research.
The results in this study are in accordance with those of a similar study from the Norwegian tonsil surgery register [15]. The Norwegian study was a single-centre study conducted on the first 137 consecutive patients ever registered in that registry, while the present study was a much larger (1991 patients) multicentre study. It could be argued that the good results shown in the Norwegian study, at least in part, could be due to dedicated surgeons in a pioneering clinic. It is therefore reassuring that the present study on the NTSRS, operational for almost 25 years, yielded basically the same good results as the Norwegian study.
In both the present study and the Norwegian study, most of the discrepancies regarding the variable indication for surgery were found for recurrent tonsillitis and chronic tonsillitis. There are at least two possible explanations. First, there is no code for recurrent tonsillitis in the International Classification of Diseases (ICD) codebook. While the NTSRS allows the surgeon to separate these two indications, the medical records based on the ICD only allow chronic tonsillitis. This may lead the surgeon to register chronic tonsillitis in the medical records even though the patient suffers from recurrent tonsillitis. Second, there is no generally accepted definition for either of these two indications, and there is also a probable overlap between the conditions. It would be appropriate for future versions of the ICD to include a specific code for recurrent tonsillitis, as it is a common condition encountered in clinical practice. Studies on the clinical characteristics of these types of infectious tonsillitis are needed to determine the potential differences.
The surgical technique showed almost perfect agreement (kappa, 0.82; AC1, 0.93), which indicates that the data regarding surgical techniques in the NTSRS are valid. However, it is important to note that 5.1% of the registrations were non-matches. The most common non-matches (3.9%) were NTSRS cold technique registrations that were contradicted by the medical records. Because such misclassification places surgeries performed with hot techniques in the cold group, the positive effects of cold techniques on bleeding complications found in NTSRS data have not been exaggerated [7,8]. As many NTSRS-based clinical improvement projects have been (and hopefully will be) conducted with the aim of increasing the number of surgeries performed using cold techniques, this problem needs to be communicated to surgeons to ensure that registrations in the NTSRS are correct.
The variable postoperative bleeding, which is recorded by the surgeon, is intended to capture bleeding that occurs after extubation but before the patient is discharged from the index surgery. In total, 28 such events were recorded in the NTSRS, the medical records or both (Table 4). It is undoubtedly problematic that only 9 were found in both. The reason why 7 events were recorded only in the medical records is probably that the surgeon registers the perioperative data immediately after surgery, while bleeding often occurs somewhat later. Capturing these events in such cases would require the surgeon to return to and change the registration. It is more puzzling why 12 events were captured only in the NTSRS. One explanation could be that surgeons are not aware of the NTSRS definition of postoperative bleeding. According to the NTSRS instructions, only postoperative bleeding that occurs after the patient is extubated should be registered. It is possible that some surgeons register unexpected bleeding occurring before the patient is extubated. Such events would not be identified in the medical records by the monitor and not registered in the VDB. It must be acknowledged that this is an NTSRS variable that in its present form must be interpreted with caution due to suboptimal criterion validity.
Although the statistical analysis showed almost perfect agreement (AC1) for all 30-day PROM variables, there were still some notable discrepancies between the NTSRS and the medical records. Unlike the perioperative form, where the surgeon can learn about the definitions on the NTSRS webpage, no such information is available to the patients, a circumstance that may increase the risk of misunderstandings and affect criterion validity. The questions in the PROM questionnaire were constructed in 2009 with the intention of collecting data on very specific events. The definitions of these events (as presented in Table 1) have never been publicly shared (e.g., with patients). The variables admission due to bleeding and reoperation due to bleeding can be taken as examples, as both showed statistically significant differences in rates between the NTSRS and the medical records. Nine patients reported admission without being admitted, and 7 patients reported reoperation without undergoing reoperation. The majority of these patients had a complication but not the complication that the question was intended to capture. The consequence is that the NTSRS is at risk of overestimating the rates of these complications. One explanation for this over-reporting may be that patients have problems understanding the questions or cannot find a question that asks about their specific experience. The results must be taken seriously as they indicate suboptimal criterion validity. By cross-tabulation and statistical analysis of different alternative interpretations of the PROM variables, it was ensured that no alternative interpretation of the questions resulted in better agreement.
One of the 30-day PROM variables stood out with only moderate agreement according to the kappa statistic: contact due to pain. The major reason for this result is probably a limitation of the study methodology and design. Sweden does not have a national medical record system, and patients may have contacted their general practitioner or the web- and telephone-accessible National Health Care guide (www.1177.se) instead of the operating clinic. Such contact would not be recorded in the medical records that were accessible to the monitors. Differences between the surgical centres in the routines for registering telephone contact (i.e., some centres may not register such contact in their medical records) might also be a factor. It is therefore possible that this variable suffers from falsely low agreement due to missing data. This also means that the difference in rates between the NTSRS and the medical records should be interpreted with caution.
Comparison of the responders and non-responders showed that women were more prone to respond to the questionnaire than men. There was also a significant difference between the responders and non-responders in the variable reoperation due to bleeding, with a higher rate of reoperation in the responder group. One possible explanation is that reoperation is so traumatizing that it triggers the patient to respond to the questionnaire. In contrast, no such differences were found in contact due to bleeding, contact due to pain or readmission due to bleeding. The conclusion is that even though the response rate for the 30-day NTSRS questionnaire is slightly above 50%, the rates can be considered representative for the whole population with the exception of reoperation due to bleeding.
There are many quality aspects of quality registers [3,12,17], and the NTSRS is in a good position to fulfil many of these. The NTSRS has high coverage and completeness [1], and it is unique as one of only two national tonsil surgery quality registers. It must also be considered consistent, as it has been in use for more than a decade without major changes in the most important variables, allowing for longitudinal analysis. The results of this study now add that the NTSRS data are valid. The NTSRS also fulfils the quality criterion of usefulness: through the register website, surgical centres can benchmark against best practice and engage in competition to achieve it. The outcome measures, both complications and efficacy, are relevant to both the patients and the medical care system, as they can serve as markers for value in health care [4].

Strengths and limitations
The multicentre design ensures that the whole spectrum of surgical units in Sweden is represented. The large number of included patients indicates that the study sample does not differ in any essential aspect from the NTSRS, and the sample size is well above the number needed for generalizability.
In both the NTSRS and the medical records, data were provided by hundreds of different surgeons (the same clinics and clinicians report to both the NTSRS and the medical records). For the medical records, data were also transferred by the monitors into the VDB. These circumstances entail a risk of impaired data quality. However, several conditions can reasonably be assumed to have counteracted this potential weakness. The surgeons who reported to the register had access to the registry guidelines through the NTSRS website, and the carefully constructed monitor manual aimed to ensure that the data collection from the medical records and the registration in the VDB were performed in a similar manner by all the monitors.
The study design has some limitations that probably affected the results negatively. As mentioned before, the monitors only had access to the medical records of the surgical unit and ENT units in the catchment area (i.e., the health care region). There are several other (non-ENT) health care services available for counselling and guidance, with medical records that were not available to the monitors. In addition, patients who had to seek medical care for a complication while travelling outside their health care region during the postoperative convalescence would be missed.
Another limitation of this study (and of the NTSRS in general) concerns timeliness. In quality register methodology, timeliness refers to the time lapse from an event to the registration of that event in the database. The longer the time lapse, the greater the risk that incorrect data are entered. There are at least three problems with timeliness that may negatively affect the data quality in the NTSRS and EMRs and thereby also the results of this study. First, there is no de facto deadline for the surgeons to register data in either the EMR or the NTSRS. Based on our experience and knowledge, however, we assume that most registrations in both the NTSRS and EMRs are made within minutes or hours after the surgery. Second, the NTSRS database does not contain data on the time from surgery to registration in the NTSRS, meaning that the impact of timeliness cannot be evaluated in this study. Third, the PROM questionnaire analysed in this study is sent out 30 days after surgery, and the patient is expected to give correct answers regarding events that may have occurred several weeks earlier. We acknowledge that NTSRS data collection can be improved regarding timeliness, and the most effective strategy would be to link data directly from medical records to the NTSRS. An alternative would be to link data from another health register such as the NPR. At present, the NPR cannot be used as it is only updated once a year. Another weakness of both EMRs and the NPR is that many of the NTSRS variables (including PROM data) are not registered in either.

Conclusion
This study establishes that the NTSRS is well suited for monitoring clinical practices and outcomes of tonsil surgery and that the registry can be used in both clinical improvement projects and research. The study also shows that even though approximately half of the patients did not respond to the 30-day PROM questionnaire, the data can be considered representative of the whole population. However, it cannot be ignored that this study shows that there is room for improvement. The careful analysis of each variable indicated that patients sometimes did not seem to understand the questions the way they were intended. This indicates that criterion validity of the NTSRS variables can be improved, for example by providing both surgeons and patients with better information. In the future, the questionnaire could be improved with explanatory text for each question. Additionally, an increased number of questions might be beneficial for capturing negative events not covered by the current questionnaire.